feat(action-brain): Action Brain MVP 0.1b — wacli collector, auto-ingest pipeline, checkpoint health (v0.10.2)#4

Closed
ab0991-oss wants to merge 20 commits into master from staff/git-48-wacli-health-checks

Conversation

Owner

@ab0991-oss ab0991-oss commented Apr 16, 2026

Summary

Full Action Brain MVP 0.1b — wacli collector, auto-ingest pipeline, and operational health layer.

Scheduler / Cron Integration

  • gbrain action run now exits non-zero when ingest fails/degrades — cron-friendly failure detection without JSON parsing
  • CLI entrypoint wrapped in import.meta.main guard for safe module import in tests
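
A minimal sketch of the exit-code signaling described above, assuming the operation result carries a boolean `success` field. The helper name comes from this PR; its body and the `exitCodeFor` wrapper are illustrative assumptions, not the actual implementation.

```typescript
// Hypothetical result shape; the real operation result likely carries more fields.
interface OperationResult {
  success?: boolean;
}

// Reports failure only when the result explicitly says success === false.
function operationResultIndicatesFailure(result: unknown): boolean {
  if (typeof result !== "object" || result === null) return false;
  return (result as OperationResult).success === false;
}

// Hypothetical wrapper: maps a result to an exit code that cron can check
// without parsing JSON output.
function exitCodeFor(result: unknown): number {
  return operationResultIndicatesFailure(result) ? 1 : 0;
}
```

A scheduler then only needs to inspect the process exit status.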

Source ID Correctness

  • Source IDs isolated per wacli store — prevents cross-store dedup collision
  • Ambiguous bare source IDs now fail closed instead of silently misattributing commitments
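
The fail-closed rule can be sketched as follows. The qualified ID format `<store>:<id>` and the function name `resolveSourceId` are assumptions made for illustration; the PR does not show the real format.

```typescript
// Resolves a bare message ID against per-store ID sets.
// Assumption: a qualified ID is "<store>:<id>"; the real format may differ.
function resolveSourceId(bareId: string, idsByStore: Map<string, Set<string>>): string {
  const matches: string[] = [];
  idsByStore.forEach((ids, store) => {
    if (ids.has(bareId)) matches.push(`${store}:${bareId}`);
  });
  if (matches.length === 0) throw new Error(`unknown source id: ${bareId}`);
  // Fail closed: more than one candidate store means we refuse to guess.
  if (matches.length > 1) {
    throw new Error(`ambiguous source id ${bareId}: ${matches.join(", ")}`);
  }
  const only = matches[0];
  if (only === undefined) throw new Error("unreachable: no match recorded");
  return only;
}
```

Throwing on ambiguity trades a failed run for the guarantee that a commitment is never attached to the wrong store.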

Wacli Health Checks + Auto-Ingest

  • Preflight wacli health check before every ingest run (healthy / degraded / failed)
  • Staleness gate: bails on data older than --stale-after-hours (default 24h)
  • --fail-on-degraded flag for strict cron enforcement
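
A rough sketch of how the staleness gate and the `--fail-on-degraded` flag might combine. The three health states and the 24h default come from this PR; the function names and the exact decision logic are assumptions.

```typescript
type WacliHealth = "healthy" | "degraded" | "failed";

// True when the last sync is older than the threshold
// (mirrors --stale-after-hours, default 24).
function isStale(lastSyncAt: Date, now: Date, staleAfterHours: number = 24): boolean {
  const ageHours = (now.getTime() - lastSyncAt.getTime()) / (60 * 60 * 1000);
  return ageHours > staleAfterHours;
}

// Hypothetical policy: "failed" always fails the run; "degraded" fails it
// only when strict enforcement (--fail-on-degraded) is requested.
function runShouldFail(health: WacliHealth, failOnDegraded: boolean): boolean {
  if (health === "failed") return true;
  return health === "degraded" && failOnDegraded;
}
```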

Collector + Checkpoint Pipeline

  • collector.ts reads wacli message store with checkpoint-aware cursor
  • Heartbeat freshness tracking — brief shows last sync even when no new messages arrive
  • Idempotent re-ingestion: createItemWithResult() returns created vs. skipped signal
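
To illustrate the created vs. skipped signal, here is an in-memory sketch. Only the method name `createItemWithResult` comes from this PR; the item shape, the result type, and the store class are hypothetical stand-ins for the real database-backed engine.

```typescript
// Hypothetical item shape keyed by its source ID.
interface ActionItem {
  sourceId: string;
  text: string;
}

type CreateResult =
  | { status: "created"; item: ActionItem }
  | { status: "skipped"; item: ActionItem };

class InMemoryActionStore {
  private readonly items = new Map<string, ActionItem>();

  // Inserts the item once; a repeat call with the same sourceId returns
  // the stored item and reports "skipped" instead of duplicating it.
  createItemWithResult(item: ActionItem): CreateResult {
    const existing = this.items.get(item.sourceId);
    if (existing !== undefined) return { status: "skipped", item: existing };
    this.items.set(item.sourceId, item);
    return { status: "created", item };
  }
}
```

Returning the signal lets a run summary count new items separately from replays.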

Extraction Quality

  • Commitment actor normalization via stabilizeCommitments()
  • LLM retry on transient errors (overload, rate limit, timeout)
  • Owner context injection for correct actor attribution
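
The retry behavior could look roughly like this. The error-matching pattern, attempt count, and delays are assumptions for illustration; the PR only states that overload, rate-limit, and timeout errors are retried.

```typescript
// Heuristic classification by message text (an assumption; a real client
// would more likely inspect status codes or error types).
function isTransientError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return /overload|rate.?limit|timeout|timed out/i.test(message);
}

// Retries `call` on transient errors with a doubling delay between attempts.
async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      // Permanent errors and the final attempt propagate immediately.
      if (!isTransientError(err) || attempt === maxAttempts) throw err;
      const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // Only reached when maxAttempts < 1.
  throw lastError;
}
```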

Test Coverage

All new code paths have test coverage.

  • test/cli.test.ts — unit tests for operationResultIndicatesFailure()
  • test/cli-action-run.test.ts — process-level exit-code tests via Bun subprocess
  • test/fixtures/cli-action-run.preload.ts — mock module preload for isolated CLI testing
  • Tests: 647 pass, 126 skip (E2E requires DATABASE_URL), 0 in-branch failures

Pre-Landing Review

No issues found. All prior ship reviews on this branch clean (last: commit 385c7f7, quality score 8.5).

New delta (exit-code signaling commit): reviewed clean — 30 lines, well-scoped helper with full test coverage.

Greptile Review

No Greptile comments.

Plan Completion

All GIT-48 scope items implemented:

  • [DONE] wacli health checks before ingest
  • [DONE] stale/disconnected state detection and reporting
  • [DONE] scheduled execution path (gbrain action run cron-ready)
  • [DONE] freshness surfaced in brief and run summary
  • [DONE] alert/report when wacli stale >24h
  • [DONE] exit-code failure signaling for cron compatibility

TODOS

No TODO items completed in this PR (Action Brain work tracked in GIT-48).

Test plan

  • All unit tests pass (647 pass, 0 in-branch fail)
  • CLI exit-code tests pass (subprocess-level, 2 tests)
  • Wacli health + ingest pipeline tests pass
  • Source ID isolation + fail-closed tests pass

🤖 Generated with Claude Code

Documentation

  • CLAUDE.md: added test/cli-action-run.test.ts and test/fixtures/cli-action-run.preload.ts to test registry
  • CONTRIBUTING.md: added test/fixtures/ directory to project structure listing
  • CHANGELOG.md: fixed blank line spacing in Added section, no content changes

ab0991-oss and others added 14 commits April 16, 2026 22:45
Adds src/action-brain/collector.ts — deterministic wacli message reader
with file-based checkpoint store. Reads WhatsApp export files from the
wacli local store, deduplicates by message ID, and persists a checkpoint
so repeat runs only surface new messages since last sync.
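
A simplified sketch of checkpoint-aware collection, assuming the checkpoint stores the set of seen message IDs plus the newest timestamp. The type names and the checkpoint layout are hypothetical; the real collector persists its checkpoint to a file.

```typescript
interface WacliMessage {
  id: string;
  timestamp: number; // epoch milliseconds
  text: string;
}

// Hypothetical checkpoint layout.
interface Checkpoint {
  lastTimestamp: number;
  seenIds: string[];
}

// Returns only messages not seen before, plus the checkpoint to persist.
function collectNew(
  messages: WacliMessage[],
  checkpoint: Checkpoint,
): { fresh: WacliMessage[]; next: Checkpoint } {
  const seen = new Set(checkpoint.seenIds);
  const fresh: WacliMessage[] = [];
  let lastTimestamp = checkpoint.lastTimestamp;
  for (const message of messages) {
    if (seen.has(message.id)) continue; // dedupe by message ID
    seen.add(message.id);
    fresh.push(message);
    if (message.timestamp > lastTimestamp) lastTimestamp = message.timestamp;
  }
  return { fresh, next: { lastTimestamp, seenIds: Array.from(seen) } };
}
```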

Closes GIT-46.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
… (v0.10.2)

Adds src/action-brain/ingest-runner.ts — cron-ready auto-ingest pipeline
that reads new wacli messages, runs LLM extraction, and stores results.
Checkpoint-aware: skips already-processed messages. Staleness gate bails
if wacli data is older than --stale-after-hours (default 24h).

Also:
- action_engine: createItemWithResult() returns idempotency signal
- extractor: owner context injection for better extraction accuracy
- operations: action_brief reads checkpoint automatically; action_ingest_auto
  operation wires preflight + collect + extract + store in one call
- cli: `gbrain action run` command (checkpoint-path, stale-after-hours, wacli-limit flags)

Closes GIT-47.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Add collector.ts and ingest-runner.ts to Key files section
- Update action-engine.ts description (createItemWithResult idempotency)
- Update extractor.ts description (owner context injection)
- Update operations.ts count: 5 → 6 ops (adds action_ingest_auto)
- Add collector.test.ts and ingest-runner.test.ts to Testing section
- Update unit test file count: 33 → 35

Co-Authored-By: Paperclip <noreply@paperclip.ing>
…ation

Co-Authored-By: Paperclip <noreply@paperclip.ing>
…rmalization

Adds stabilizeCommitments() pipeline step to extractCommitments() so LLM
output gets message-grounded actor/source IDs on every extraction run.
Adds 295-line test suite covering actor reassignment, entity normalization,
and edge cases for the new stabilization path.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Add clarifying comment to resolveSourceMessage() explaining the intentional
  single-message LLM source_message_id fallback behavior
- Add comment to parseOptionalDate() explaining the intentional throw-on-bad-date
  safety gate (prevents checkpoint advancement on bad LLM output)
- Add comment to shouldPersistCheckpoint explaining the checkpoint write guard
- Add comment marking unreachable return [] in extractor retry loop
- Update CLAUDE.md extractor description: "two-tier Haiku→Sonnet" → accurate
  description (Sonnet default; quality gate uses Haiku→Sonnet escalation)

Co-Authored-By: Paperclip <noreply@paperclip.ing>
- CHANGELOG.md: add missing stabilizeCommitments bullet to v0.10.2
- CLAUDE.md: add stabilizeCommitments to extractor description and extractor test coverage notes
- CONTRIBUTING.md: add collector.ts + ingest-runner.ts to Action Brain file tree, fix op count 5→6

Co-Authored-By: Paperclip <noreply@paperclip.ing>
… quality gate test

- Update CHANGELOG v0.10.2 release date to 2026-04-17
- Forward ownerName/ownerAliases/retryCount/throwOnError into quality gate extractor calls
- Add quality gate owner context test (65 tests pass)
- Expand e2e-live-validation.ts matcher with alias + type handling
- Add e2e-live-validation-metrics.test.ts for matchCommitment unit tests
- Add P2 TODOs: shared utils refactor + N+1 fix (identified in pre-landing review)

Co-Authored-By: Paperclip <noreply@paperclip.ing>
@ab0991-oss ab0991-oss changed the title feat(action-brain): wacli health checks + replay stabilization (v0.10.2) feat(action-brain): Action Brain MVP 0.1b — wacli collector, auto-ingest pipeline, checkpoint health (v0.10.2) Apr 16, 2026
ab0991-oss and others added 6 commits April 17, 2026 01:19
- operationResultIndicatesFailure() sets non-zero exit code when
  action_ingest_auto returns success=false — cron/scheduler can now
  reliably detect degraded/unhealthy runs without parsing JSON
- Wrap CLI entrypoint in if (import.meta.main) guard so the module
  can be safely imported in tests without auto-executing
- Added unit coverage for operationResultIndicatesFailure() helper
- Added process-level exit-code tests via bun --preload fixture
  (test/cli-action-run.test.ts + test/fixtures/cli-action-run.preload.ts)

Co-Authored-By: Paperclip <noreply@paperclip.ing>
… fixes

Co-Authored-By: Paperclip <noreply@paperclip.ing>
- CLAUDE.md: add test/cli-action-run.test.ts and test/fixtures/cli-action-run.preload.ts to test registry
- CONTRIBUTING.md: add test/fixtures/ directory to project structure listing
- CHANGELOG.md: fix blank line spacing in Added section

Co-Authored-By: Paperclip <noreply@paperclip.ing>
ab0991-oss pushed a commit that referenced this pull request Apr 19, 2026
* feat: add minion_jobs schema, migration v5, and executeRaw to BrainEngine

Foundation for the Minions job queue system. Adds:
- minion_jobs table (20 columns) with CHECK constraints, partial indexes,
  and RLS. Inspired by BullMQ's job model, adapted for Postgres.
- Migration v5 creates the table for existing databases.
- executeRaw<T>() method on BrainEngine interface for raw SQL access,
  needed by the Minions module for claim queries (FOR UPDATE SKIP LOCKED),
  token-fenced writes, and atomic stall detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions job queue — queue, worker, backoff, types

BullMQ-inspired Postgres-native job queue built into GBrain. No Redis.
No external dependencies. Postgres transactions replace Lua scripts.

- MinionQueue: submit, claim (FOR UPDATE SKIP LOCKED), complete/fail
  (token-fenced), atomic stall detection (CTE), delayed promotion,
  parent-child resolution, prune, stats
- MinionWorker: handler registry, lock renewal, graceful SIGTERM,
  exponential backoff with jitter, UnrecoverableError bypass
- MinionJobContext: updateProgress(), log(), isActive() for handlers
- 8-state machine: waiting/active/completed/failed/delayed/dead/
  cancelled/waiting-children
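
As an illustration of exponential backoff with jitter, here is one possible formula. The base delay, jitter fraction, and function name are assumptions; this is not the formula the Minions module actually uses.

```typescript
// Delay before the next retry: base * 2^(attempts - 1), spread by +/- jitter.
// `random` is injectable so the result can be made deterministic in tests.
function backoffDelayMs(
  attemptsMade: number,
  baseMs: number = 1000,
  jitter: number = 0.2,
  random: () => number = Math.random,
): number {
  // attemptsMade = 0 is treated like the first attempt (no negative exponent).
  const exponent = Math.max(0, attemptsMade - 1);
  const raw = baseMs * Math.pow(2, exponent);
  const spread = raw * jitter;
  return Math.round(raw - spread + random() * 2 * spread);
}
```

Jitter keeps many jobs that failed together from retrying at the same instant.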

Patterns stolen from: BullMQ (lock tokens, stall detection, flows),
Sidekiq (dead set, backoff formula), Inngest (checkpoint/resume).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: 43 tests for Minions job queue

Full coverage of the Minions module against PGLite in-memory:
- Queue CRUD (9): submit, get, list, remove, cancel, retry, duplicate
- State machine (6): waiting→active→completed/failed, retry→delayed→waiting
- Backoff (4): exponential, fixed, jitter range, attempts_made=0 edge
- Stall detection (3): detect stalled, counter increment, max→dead
- Dependencies (5): parent waits, fail_parent, continue, remove_dep, orphan
- Worker lifecycle (5): register, start-without-handlers, claim+execute,
  non-Error throws, UnrecoverableError bypass
- Lock management (3): renewal, token mismatch, claim sets lock fields
- Claim mechanics (4): empty queue, priority ordering, name filtering,
  delayed promotion timing
- Cancel & retry (2): cancel active, retry dead

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions CLI commands and MCP operations

Wire Minions into the GBrain CLI and MCP layer:

CLI (gbrain jobs):
  submit <name> [--params JSON] [--follow] [--dry-run]
  list [--status S] [--queue Q] [--limit N]
  get <id> — detailed view with attempt history
  cancel/retry/delete <id>
  prune [--older-than 30d]
  stats — job health dashboard
  work [--queue Q] [--concurrency N] — Postgres-only worker daemon

6 MCP operations (contract-first, auto-exposed via MCP server):
  submit_job, get_job, list_jobs, cancel_job, retry_job, get_job_progress

Built-in handlers: sync, embed, lint, import. --follow runs inline.
Worker daemon blocked on PGLite (exclusive file lock).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for Minions job queue

CLAUDE.md: added Minions files to key files, updated operation count (36),
BrainEngine method count (38), test file count (45), added jobs CLI commands.
CHANGELOG.md: added Minions entry to v0.10.0 (background jobs, retry, stall
detection, worker daemon).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions v2 — agent orchestration primitives (pause/resume, inbox, tokens, replay)

Adds the foundation for Minions as universal agent orchestration infrastructure.
GBrain's Postgres-native job queue now supports durable, observable, steerable
background agents. The OpenClaw plugin (separate repo) will consume these via
library import, not MCP, for zero-latency local integration.

## New capabilities

- **Concurrent worker** — Promise pool replaces sequential loop. Per-job
  AbortController for cooperative cancellation. Graceful shutdown waits for
  all in-flight jobs via Promise.allSettled.
- **Pause/resume** — pauseJob clears the lock and fires AbortSignal on active
  jobs. Handlers check ctx.signal.aborted and exit cleanly. resumeJob returns
  paused jobs to waiting. Catch block skips failJob when signal.aborted.
- **Inbox (separate table)** — minion_inbox table for sidechannel messages.
  sendMessage with sender validation (parent job or admin). readInbox is
  token-fenced and marks read_at atomically. Separate table avoids row bloat
  from rewriting JSONB on every send.
- **Token accounting** — tokens_input/tokens_output/tokens_cache_read columns.
  updateTokens accumulates; completeJob rolls child tokens up to parent.
  USD cost computed at read time (no cost_usd column — pricing too volatile).
- **Job replay** — replayJob clones a terminal job with optional data overrides.
  New job, fresh attempts, no parent link.
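
Cooperative cancellation on the handler side might look like the sketch below. It assumes a context exposing `signal: AbortSignal`, as listed in the handler contract; the `runHandler` function and its step-based structure are hypothetical.

```typescript
// Minimal slice of the handler context relevant to cancellation.
interface HandlerContext {
  signal: AbortSignal;
}

// Runs steps in order, checking the abort signal before each one so a
// pause or cancel takes effect at the next step boundary.
function runHandler(
  ctx: HandlerContext,
  steps: Array<() => void>,
): { completed: number; aborted: boolean } {
  let completed = 0;
  for (const step of steps) {
    if (ctx.signal.aborted) return { completed, aborted: true };
    step();
    completed += 1;
  }
  return { completed, aborted: false };
}
```

A handler that returns early this way exits cleanly instead of being reported as failed.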

## Handler contract additions

MinionJobContext now provides:
- `signal: AbortSignal` — cooperative cancellation
- `updateTokens(tokens)` — accumulate token usage
- `readInbox()` — check for sidechannel messages
- `log()` — now accepts string or TranscriptEntry

## MCP operations added

pause_job, resume_job, replay_job, send_job_message — all auto-generate CLI
commands and MCP server endpoints.

## Library exports

package.json exports map adds ./minions and ./engine-factory paths so plugins
can `import { MinionQueue } from 'gbrain/minions'` for direct library use.

## Instruction layer (the teaching)

- skills/minion-orchestrator/SKILL.md — when/how to use Minions, decision
  matrix, lifecycle management, anti-patterns
- skills/conventions/subagent-routing.md — cross-cutting rule: all background
  work goes through Minions
- RESOLVER.md — trigger entries for agent orchestration
- manifest.json — registered

## Schema migration v6

Additive: 3 token columns, paused status, minion_inbox table with unread index.
Full Postgres + PGLite support. No backfill needed.

## Tests

65 tests (was 43): pause/resume (5), inbox (6), tokens (4), replay (4),
concurrent worker context (3), plus all existing coverage.

## What's NOT in this commit

Deferred to follow-up PRs:
- LISTEN/NOTIFY subscribe (needs real Postgres E2E)
- Resource governor (depends on concurrent worker stress testing)
- Routing eval harness (needs API keys + benchmark data)
- OpenClaw plugin (separate @gbrain/openclaw-minions-plugin repo)

See docs/designs/MINIONS_AGENT_ORCHESTRATION.md for full CEO-approved design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): migration v7 — agent_parity_layer schema

Adds columns on minion_jobs (depth, max_children, timeout_ms, timeout_at,
remove_on_complete, remove_on_fail, idempotency_key) plus the new
minion_attachments table. Three partial indexes for bounded scans:
idx_minion_jobs_timeout, idx_minion_jobs_parent_status, and
uniq_minion_jobs_idempotency. Check constraints enforce non-negative depth
and positive child cap / timeout.

Additive migration — existing installs pick it up via ensureSchema on next
use. No user action required.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): extend types for v7 parity layer

Extends MinionJob with depth/max_children/timeout_ms/timeout_at/
remove_on_complete/remove_on_fail/idempotency_key. Extends MinionJobInput
with the same options plus max_spawn_depth override. Adds MinionQueueOpts
(maxSpawnDepth default 5, maxAttachmentBytes default 5 MiB). Adds
AttachmentInput/Attachment shapes and ChildDoneMessage in the InboxMessage
union. rowToMinionJob updated to pick up the new columns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): attachments validator

New module validateAttachment() gates every attachment write. Rejects empty
filenames, path traversal (.., /, \), null bytes, oversized content (5 MiB
default, per-queue override), invalid base64, and implausible content_type
headers. Returns normalized { filename, content_type, content (Buffer),
sha256, size } on success.

The DB also enforces UNIQUE (job_id, filename) as defense-in-depth for
concurrent addAttachment races — JS-only checks are not sufficient.
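
A reduced sketch of the validation rules listed above, covering filename checks, base64 shape, and the size cap. The input shape and function name are assumptions, and the content-type check, Buffer decoding, and sha256 hashing of the real validator are omitted.

```typescript
// Hypothetical input shape; the real AttachmentInput is not shown here.
interface AttachmentInputSketch {
  filename: string;
  contentBase64: string;
}

type AttachmentCheck = { ok: true; size: number } | { ok: false; reason: string };

const DEFAULT_MAX_ATTACHMENT_BYTES = 5 * 1024 * 1024; // 5 MiB

function validateAttachmentSketch(
  input: AttachmentInputSketch,
  maxBytes: number = DEFAULT_MAX_ATTACHMENT_BYTES,
): AttachmentCheck {
  const { filename, contentBase64 } = input;
  if (filename.length === 0) return { ok: false, reason: "empty filename" };
  if (filename.includes("\0")) return { ok: false, reason: "null byte in filename" };
  if (filename.includes("..") || filename.includes("/") || filename.includes("\\")) {
    return { ok: false, reason: "path traversal" };
  }
  const wellFormed = /^[A-Za-z0-9+/]*={0,2}$/.test(contentBase64) && contentBase64.length % 4 === 0;
  if (!wellFormed) return { ok: false, reason: "invalid base64" };
  // Decoded size: 3 bytes per 4 characters, minus padding.
  const padding = contentBase64.endsWith("==") ? 2 : contentBase64.endsWith("=") ? 1 : 0;
  const size = (contentBase64.length / 4) * 3 - padding;
  if (size > maxBytes) return { ok: false, reason: "oversized content" };
  return { ok: true, size };
}
```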

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): queue v7 — depth, child cap, timeouts, cascade, idempotency, child_done

Wraps completeJob and failJob in engine.transaction() so parent hook
invocations (resolveParent, failParent, removeChildDependency) fold into
the same transaction as the child update. A process crash between child
and parent can't strand the parent in waiting-children anymore.

Adds v7 behaviors:
- Depth tracking. add() computes depth = parent.depth + 1 and rejects
  past maxSpawnDepth (default 5).
- Per-parent child cap. add() takes SELECT ... FOR UPDATE on the parent,
  counts non-terminal children, rejects when count >= max_children.
  NULL max_children = no cap.
- Per-job wall-clock timeout. claim() populates timeout_at when
  timeout_ms is set. New handleTimeouts() dead-letters expired rows with
  error_text='timeout exceeded'. Terminal — no retry.
- Cascade cancel. cancelJob() walks descendants via recursive CTE with
  depth-100 runaway cap. Returns the root row. Re-parented descendants
  (parent_job_id NULL) are naturally excluded.
- Idempotency. add() uses INSERT ... ON CONFLICT (idempotency_key) DO
  NOTHING RETURNING; falls back to SELECT when RETURNING is empty. Same
  key always yields the same job id.
- child_done inbox. completeJob inserts {type:'child_done', child_id,
  job_name, result} into the parent's inbox in the same transaction as
  the token rollup, guarded by EXISTS so terminal/deleted parents skip
  without FK violation. New readChildCompletions(parent_id, lock_token,
  since?) helper; token-fenced like readInbox.
- removeOnComplete / removeOnFail. Deletes the row after the parent hook
  fires, so parent policy sees consistent state.
- Attachment methods. addAttachment validates via validateAttachment
  then INSERTs; UNIQUE (job_id, filename) backs the JS dup check.
  listAttachments, getAttachment, deleteAttachment round out the API.
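
Two of the behaviors above, depth tracking and idempotency keys, can be modeled in memory as follows. This is a sketch under stated assumptions: the class and field names are hypothetical, and the real queue enforces both rules in Postgres (a depth check in add() and INSERT ... ON CONFLICT on the key).

```typescript
interface SketchJob {
  id: number;
  depth: number;
  idempotencyKey: string | null;
}

class SketchQueue {
  private nextId = 1;
  private readonly jobs = new Map<number, SketchJob>();
  private readonly idByKey = new Map<string, number>();
  private readonly maxSpawnDepth: number;

  constructor(maxSpawnDepth: number = 5) {
    this.maxSpawnDepth = maxSpawnDepth;
  }

  add(parentId: number | null = null, idempotencyKey: string | null = null): SketchJob {
    // Same key always yields the same job.
    if (idempotencyKey !== null) {
      const existingId = this.idByKey.get(idempotencyKey);
      const existing = existingId === undefined ? undefined : this.jobs.get(existingId);
      if (existing !== undefined) return existing;
    }
    let depth = 0;
    if (parentId !== null) {
      const parent = this.jobs.get(parentId);
      if (parent === undefined) throw new Error(`parent ${parentId} not found`);
      depth = parent.depth + 1;
      if (depth > this.maxSpawnDepth) {
        throw new Error(`depth ${depth} exceeds maxSpawnDepth ${this.maxSpawnDepth}`);
      }
    }
    const job: SketchJob = { id: this.nextId, depth, idempotencyKey };
    this.nextId += 1;
    this.jobs.set(job.id, job);
    if (idempotencyKey !== null) this.idByKey.set(idempotencyKey, job.id);
    return job;
  }
}
```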

Fixes pre-existing inverted status bug: add() now puts children in
waiting/delayed (not waiting-children) and atomically flips the parent
to waiting-children in the same transaction. Tests no longer need
manual UPDATE workarounds.

Two correctness fixes:
- Sibling completion race. Under READ COMMITTED, two grandchildren
  completing concurrently each saw the other as still-active in the
  pre-commit snapshot and neither flipped the parent. Fixed by taking
  SELECT ... FOR UPDATE on the parent row at the start of completeJob
  and failJob transactions, serializing siblings on the parent lock.
- JSONB double-encode. postgres.js conn.unsafe(sql, params) auto-
  JSON-encodes parameters. Calling JSON.stringify(obj) first stored a
  JSON string literal (jsonb_typeof=string) and broke payload->>'key'
  queries silently. Removed JSON.stringify from three call sites
  (child_done inbox post, updateProgress, sendMessage). PGLite tolerated
  both forms so unit tests missed it — real-PG E2E caught it.
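
The double-encode bug can be reproduced without a database. The sketch below assumes, as the commit describes, that the driver JSON-encodes each parameter itself; `driverEncode` and `jsonbTypeof` are stand-ins written for illustration, not postgres.js or Postgres APIs.

```typescript
// Stand-in for a driver that JSON-encodes parameters on its own.
function driverEncode(param: unknown): string {
  return JSON.stringify(param);
}

// Mimics what jsonb_typeof would report for the stored value.
function jsonbTypeof(stored: string): string {
  const value: unknown = JSON.parse(stored);
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

const childDonePayload = { type: "child_done", child_id: 7 };

// Buggy call site: stringify first, then the driver encodes again.
const doubleEncoded = driverEncode(JSON.stringify(childDonePayload));
// Fixed call site: pass the raw object and let the driver encode once.
const singleEncoded = driverEncode(childDonePayload);

console.log(jsonbTypeof(doubleEncoded)); // "string"
console.log(jsonbTypeof(singleEncoded)); // "object"
```

A value stored as a JSON string has no keys, which is why `payload->>'key'` lookups came back empty.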

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): worker — timeout safety net + handleTimeouts tick

Worker tick now calls handleStalled() first, then handleTimeouts() — stall
requeue wins over timeout dead-letter when both could fire in the same
cycle. handleTimeouts() guards on lock_until > now() so stalled jobs take
the retryable path.

launchJob schedules a per-job setTimeout(timeout_ms) that fires ctx.signal
as a best-effort handler interrupt. The timer is always cleared in .finally
so process exit isn't delayed by a dangling timer. Handlers that respect
AbortSignal stop cleanly; handlers that ignore it still get dead-lettered
by the DB-side handleTimeouts.
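
The precedence between stall requeue and timeout dead-letter can be expressed as a pure decision function. This is a sketch: the field names mirror the `lock_until` and `timeout_at` columns, but the function and its return values are hypothetical, and the real logic runs as SQL in handleStalled() and handleTimeouts().

```typescript
interface ActiveJobSketch {
  lockUntil: number; // epoch ms; renewed by a live worker
  timeoutAt: number | null; // epoch ms; null when no timeout_ms was set
}

type TickAction = "requeue_stalled" | "dead_letter_timeout" | "none";

function tickAction(job: ActiveJobSketch, now: number): TickAction {
  // Stall handling runs first: an expired lock means the worker stopped
  // renewing, so the job takes the retryable path.
  if (job.lockUntil <= now) return "requeue_stalled";
  // Timeout handling only considers jobs whose lock is still live.
  if (job.timeoutAt !== null && job.timeoutAt <= now) return "dead_letter_timeout";
  return "none";
}
```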

Removed post-completeJob and post-failJob parent-hook calls from the worker
— those are now inside the queue method transactions. Worker becomes
simpler and crash-safer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(minions): 33 new unit tests for v7 parity layer

Covers depth cap, per-parent child cap, timeout dead-letter, cascade
cancel (including the re-parent edge case), removeOnComplete /
removeOnFail, idempotency (single + concurrent), child_done inbox
(posted in txn + survives child removeOnComplete + since cursor),
attachment validation (oversize, path traversal, null byte, duplicates,
base64), AbortSignal firing on pause mid-handler, catch-block skipping
failJob when aborted, worker in-flight bookkeeping, token-rollup guard
when parent already terminal, and setTimeout safety-net cleanup.

Existing tests updated to remove the inverted-status manual UPDATE
workarounds that the add() fix made obsolete.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(e2e): Minions v7 concurrency + OpenClaw resilience coverage

minions-concurrency.test.ts spins two MinionWorker instances against the
test Postgres, submits 20 jobs, and asserts zero double-claims (every job
runs exactly once). This is the only test that actually proves FOR UPDATE
SKIP LOCKED under real concurrency — PGLite runs on a single connection
and can't exercise the race.

minions-resilience.test.ts covers the six OpenClaw daily pains:
1. Spawn storm caps enforce under concurrent submit.
2. Agent stall → handleStalled() requeues; handleTimeouts() skips
   (lock_until guard).
3. Forgotten dispatches recoverable via child_done inbox.
4. Cascade cancel stops grandchildren mid-flight.
5. Deep tree fan-in (parent → 3 children → 2 grandchildren each)
   completes with the full inbox chain.
6. Parent crash/recovery resumes from persisted state.

helpers.ts extends ALL_TABLES with minion_attachments, minion_inbox, and
minion_jobs (FK dependents first) so E2E teardown doesn't leak rows
between runs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: release v0.11.0 — Minions v7 agent orchestration primitives

Bumps VERSION / package.json to 0.11.0. Adds CHANGELOG entry covering
depth tracking, max_children, per-job timeouts, cascade cancel,
idempotency keys, child_done inbox, removeOnComplete/Fail, attachments,
migration v7, plus the two correctness fixes (sibling completion race
and JSONB double-encode).

TODOS.md captures the four v7 follow-ups: per-queue rate limiting,
repeat/cron scheduler, worker event emitter, and waitForChildren
convenience helpers.

1066 unit + 105 E2E = 1171 tests passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(minions): unify JSONB inserts, tighten nullish coalescing

Three non-blocker cleanups from post-ship review of v0.11.0:

- queue.ts add() and completeJob(): pre-stringifying with JSON.stringify
  while other sites pass raw objects with $n::jsonb casts. postgres.js
  double-encodes if you stringify first — works on PGLite (text→JSONB
  auto-cast), fails silently on real PG. Unify on raw object + explicit
  $n::jsonb cast.
- queue.ts readChildCompletions: since clause used sent_at > $2 relying
  on PG's implicit text→TIMESTAMPTZ coercion. Explicit $2::timestamptz
  is safer and clearer.
- types.ts rowToMinionJob: parent_job_id used || which coerces 0 to null.
  Harmless today (SERIAL IDs start at 1) but ?? is semantically correct.
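
The difference between the two operators shows up only for falsy non-null values such as 0. The function names below are hypothetical and exist only to contrast the two forms.

```typescript
// `||` treats every falsy value (0, "", NaN) as missing.
function parentIdWithOr(raw: number | null | undefined): number | null {
  return raw || null;
}

// `??` falls back only for null and undefined, so 0 survives.
function parentIdWithNullish(raw: number | null | undefined): number | null {
  return raw ?? null;
}

console.log(parentIdWithOr(0)); // null
console.log(parentIdWithNullish(0)); // 0
```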

All 110 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(minions): updateProgress missed $1::jsonb cast in unification

Residual from c502b7e — updateProgress was the only remaining JSONB write
without the explicit ::jsonb cast. Not broken (implicit cast works) but
breaks the convention the prior commit unified everywhere else.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* doc: Minions v7 skill count + jobs subcommands (26 skills)

README: bump skill count 25 → 26, add minion-orchestrator row, add
`gbrain jobs` command family block so v0.11.0's headline feature is
actually discoverable from the top-level commands reference.

CLAUDE.md: unit test count 48 → 49 (minions.test.ts expanded), skill
count 25 → 26, add minion-orchestrator to Key files + skills categorization,
expand MinionQueue one-liner to cover v7 primitives (depth/child-cap,
timeouts, idempotency, child_done inbox, removeOnComplete/Fail).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat: Minions adoption UX — smoke test + migration + pain-triggered routing

Teach OpenClaw when to reach for Minions vs native subagents. Ship three
pieces so upgrading from v0.10.x actually lands for real users:

- `gbrain jobs smoke` — one-command health check that submits a `noop` job,
  runs a worker, verifies completion, and prints engine-aware guidance
  (PGLite installs get the "daemon needs Postgres, use --follow" note).
  Fails loud if schema's below v7 so the user knows to `gbrain init`.

- `skills/migrations/v0.11.0.md` — post-upgrade migration file the
  auto-update agent reads. Six steps: apply schema, run smoke, ask user
  via AskUserQuestion which mode they want (always / pain_triggered / off),
  write to `~/.gbrain/preferences.json`, sanity-check handlers, mark done.
  Completeness scores on each option so the recommendation is explicit.

- `skills/conventions/subagent-routing.md` rewritten — was a "MUST use
  Minions for ALL background work" mandate, now reads preferences.json
  on every routing decision and branches on three modes. Mode B
  (pain_triggered) is the default: keep subagents until gateway drops
  state, parallel > 3, runtime > 5min, or user expresses frustration.
  Then pitch the switch in-session with a specific script.
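
The three-mode routing rule can be summarized as a small decision function. The mode names and thresholds come from the text above; the signal fields, the function name, and the route labels are assumptions, since the real convention is a skill document read by the agent rather than code.

```typescript
type MinionMode = "always" | "pain_triggered" | "off";

// Hypothetical signals an agent might track during a session.
interface RoutingSignals {
  gatewayDroppedState: boolean;
  parallelTasks: number;
  expectedRuntimeMs: number;
  userFrustrated: boolean;
}

type Route = "minions" | "subagent" | "pitch_minions";

function routeBackgroundWork(mode: MinionMode, signals: RoutingSignals): Route {
  if (mode === "always") return "minions";
  if (mode === "off") return "subagent";
  // pain_triggered: keep subagents until one of the pain signals fires.
  const painful =
    signals.gatewayDroppedState ||
    signals.parallelTasks > 3 ||
    signals.expectedRuntimeMs > 5 * 60 * 1000 ||
    signals.userFrustrated;
  return painful ? "pitch_minions" : "subagent";
}
```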

Rename pass: "Minions v7" → "Minions" in README (JOBS block), TODOS.md
(P1 section header + depends-on), CHANGELOG.md v0.11.0 entry. v7 stays
as the internal schema version in code/migration contexts. The product
name is just Minions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* doc(readme): promote Minions — 6 OpenClaw pains + how each is fixed

The one-line mention in the skills table wasn't doing the work. Added a
dedicated section between "How It Works" and "Getting Data In" that leads
with the six multi-agent failures every OpenClaw user hits daily (spawn
storms, hung handlers, forgotten dispatches, unstructured debugging,
gateway crashes, runaway grandchildren) and maps each pain to the
specific Minions primitive that fixes it.

Includes the smoke test command, the adoption default (pain_triggered),
and a pointer to skills/minion-orchestrator for the full patterns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(bench): add harness for Minions vs OpenClaw subagent dispatch

Shared harness (openclawDispatch + minionsHandler) using matching
claude-haiku-4-5 calls on both sides so the delta measures queue+
dispatch overhead on top of identical LLM work. Includes
statsFromResults (p50/p95/p99) and formatStats helpers. Uses
`openclaw agent --local` embedded mode; does not test gateway
multi-agent fan-out (documented in the harness header).
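
For readers unfamiliar with latency percentiles, a nearest-rank computation is sketched below. The harness's actual statsFromResults is not shown in this PR, so the method (nearest rank) and the names here are assumptions.

```typescript
interface LatencyStats {
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  const index = Math.min(sortedAsc.length - 1, Math.max(0, rank - 1));
  const value = sortedAsc[index];
  if (value === undefined) throw new Error("index out of range");
  return value;
}

function nearestRankStats(latenciesMs: number[]): LatencyStats {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = latenciesMs.slice().sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}
```

With only 20 samples per side, p95 and p99 both land on the slowest one or two dispatches, so they are best read as worst-case indicators.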

* test(bench): durability under SIGKILL — Minions vs OpenClaw --local

Headline bench for the claim: when the orchestrator dies mid-dispatch,
Minions rescues via PG state + stall detection; OpenClaw --local loses
in-flight work outright.

Minions side: seed 10 active+expired-lock rows (exact state a SIGKILLed
worker leaves) then run a rescue worker. Expect 10/10 completed.
OpenClaw side: spawn 10 `openclaw agent --local` in parallel, SIGKILL
each at 500ms, count pre-kill delivered output. Expect 0/10 — no
persistence layer, nothing to recover.

Budget: ~$0 (Minions handlers sleep 10ms; OC calls die at 500ms so
partial LLM billing is negligible).

* test(bench): per-dispatch throughput — Minions vs OpenClaw --local

20 serial dispatches each side, identical claude-haiku-4-5 call with the
same trivial prompt. p50/p95/p99 reported via statsFromResults. Serial
(not parallel) so the per-dispatch cost is measured honestly and LLM
token spend stays bounded (~$0.08 total).

Minions: one queue, one worker, one concurrency. Submit → poll to
completion before next submit. OpenClaw: N sequential
`openclaw agent --local` spawns.

* test(bench): fan-out — Minions 10-wide concurrency vs 10 parallel OC spawns

Parent dispatches 10 children, waits for all to return. Minions uses
worker concurrency=10 sharing one warm process; OpenClaw parallel
`openclaw agent --local` spawns, each boots its own runtime.

3 runs × 10 children per run. Reports ok count and wall time per run
plus summary. Honest caveat documented: does not test OC gateway
multi-agent fan-out — that needs a custom WS client and LLM-backed
parent agent. This measures what users script today.

Budget: ~$0.12 LLM spend.

* test(bench): memory — 10 in-flight subagents, single-proc vs 10-proc cost

Measures resident memory for keeping 10 subagents in flight. Minions:
one worker process, concurrency=10 with handlers that park on a
promise — sample RSS of the test process via process.memoryUsage().
OpenClaw: 10 parallel `openclaw agent --local` processes, sum their
RSS via `ps -o rss=`.

Handlers are cheap sleeps, no LLM — we want harness memory, not LLM
client state. Budget: $0.

* test(bench): fan-out — don't gate on OC success rate, report numbers

Initial run showed OC parallel `--local` at 10-wide hits a roughly 40%
failure rate (17/30 succeeded across 3 runs). That's the finding, not a
test bug: process startup stampede + LLM rate limits. Bench now prints
error samples and reports the numbers instead of gating.

Minions side still gates at 90% (30/30 observed in practice).

* doc(benchmarks): Minions vs OpenClaw --local subagent dispatch

Real numbers on four claims: durability, throughput, fan-out, memory.
Same claude-haiku-4-5 call on both sides so the delta is queue+dispatch+
process cost on top of identical LLM work.

Headline: Minions rescues 10/10 from a SIGKILLed worker in 458ms while
OpenClaw --local loses all 10; ~10× faster per dispatch (778ms p50 vs
8086ms p50); ~21× faster at 10-wide fan-out AND 100% reliable vs OC's
43% failure rate; 2 MB vs 814 MB to keep 10 subagents in flight.

Honest caveats section covers what this doesn't test (OC gateway
multi-agent, load tests, other models). Fully reproducible via
test/e2e/bench-vs-openclaw/.

* doc(readme): inject Minions vs OpenClaw bench numbers

Headline deltas now in the Minions section: 10/10 vs 0/10 on crash,
~10× faster per dispatch, ~21× faster fan-out at 10-wide with 0%
failure vs 43%, ~400× less memory. Links to the full bench doc.

Prose first said Minions "fixes all six pains." Now it shows the
numbers that prove it.

* bench: production Wintermute benchmark — Minions 753ms vs sub-agent timeout

Real deployment: 45K-page brain on Render+Supabase. Task: pull 99 tweets,
write brain page, commit, sync. Minions: 753ms, $0. Sub-agent: gateway
timeout (>10s, couldn't even spawn under production load).

Also: 19,240 tweets backfilled across 36 months in 15 min at $0.
Sub-agents would cost $1.08 and fail 40% of spawns.

* bench: tweet ingestion — Minions 719ms vs OpenClaw 12.5s (17×)

Production benchmark with runnable test code:
- test/e2e/bench-vs-openclaw/tweet-ingest.bench.ts (reusable)
- docs/benchmarks/2026-04-18-tweet-ingestion.md (publishable)

Task: pull 100 tweets from X API, write brain page, commit, sync.
Minions: 719ms mean, $0, 100% success.
OpenClaw: 12,480ms mean, $0.03/run, 60% success (gateway timeouts).
At scale: 36-month backfill, 19K tweets, 15 min, $0 vs est. $1.08.

* doc(benchmarks): Wintermute production data point for Minions vs OpenClaw

Adds a production-environment data point to the Minions README section:
one month of tweet ingest on Wintermute (Render + Supabase + 45K-page brain)
ran end-to-end in 753ms for $0.00 via Minions, while the equivalent
sessions_spawn hit the 10s gateway timeout and produced nothing.

Full methodology + logs in docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): preferences.ts + cli-util.ts — foundations for v0.11.1

Adds two foundational modules that apply-migrations (Lane A-4), the
v0.11.0 orchestrator (Lane C-1), and the stopgap script (Lane C-4) all
depend on.

- src/core/preferences.ts: atomic-write ~/.gbrain/preferences.json
  (mktemp + rename, 0o600, forward-compatible for unknown keys) with
  validateMinionMode, loadPreferences, savePreferences. Plus
  appendCompletedMigration + loadCompletedMigrations for the
  ~/.gbrain/migrations/completed.jsonl log (tolerates malformed lines).
  Uses process.env.HOME || homedir() so $HOME overrides work in CI and
  tests; Bun's os.homedir() caches the initial value and ignores later
  mutations.
- src/core/cli-util.ts: promptLine(prompt) helper, extracted from
  src/commands/init.ts:212-224. Shared so init, apply-migrations, and
  the v0.11.0 orchestrator's mode prompt don't each reinvent it.
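
The write-to-temp-then-rename pattern described for preferences.json could look roughly like the sketch below. It is an illustration only, not the code in src/core/preferences.ts: `writeJsonAtomic`, `mergeKeepingUnknownKeys`, and the temp-file naming are hypothetical, and it assumes a POSIX filesystem where a rename inside one directory is atomic.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, renameSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper: write JSON to a sibling temp file, then rename it over the target.
// Readers see either the old file or the new one, never a half-written file.
function writeJsonAtomic(path: string, value: unknown): void {
  const dir = dirname(path);
  mkdirSync(dir, { recursive: true });
  // The temp file lives in the same directory so the rename never crosses filesystems.
  const tmp = join(dir, `.tmp-${process.pid}-${Date.now()}`);
  writeFileSync(tmp, JSON.stringify(value, null, 2) + "\n", { mode: 0o600 });
  renameSync(tmp, path);
}

// Hypothetical helper: unknown keys already on disk survive a save (forward compatibility).
function mergeKeepingUnknownKeys(
  existing: Record<string, unknown>,
  patch: Record<string, unknown>,
): Record<string, unknown> {
  return { ...existing, ...patch };
}

// Usage: write into a throwaway directory and read the result back.
const demoDir = mkdtempSync(join(tmpdir(), "prefs-sketch-"));
const demoPath = join(demoDir, "preferences.json");
writeJsonAtomic(demoPath, mergeKeepingUnknownKeys({ future_key: 1 }, { minion_mode: "pain_triggered" }));
const demoMode = statSync(demoPath).mode & 0o777;
const demoPrefs = JSON.parse(readFileSync(demoPath, "utf8"));
```

Creating the temp file with mode 0o600 means the file never exists with wider permissions, even briefly.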

test/preferences.test.ts: 21 unit tests covering load/save atomicity,
0o600 perms, forward-compat for unknown keys, minion_mode validation,
completed.jsonl JSONL append idempotence, auto-ts population, malformed-
line tolerance in loadCompletedMigrations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(init): add --migrate-only flag (schema-only, no saveConfig)

Context: v0.11.0 migration orchestrators need a safe way to re-apply the
schema against an existing brain without risking a config flip. Today
running bare `gbrain init` with no flags defaults to PGLite and calls
saveConfig, which would silently overwrite an existing Postgres
database_url — caught by Codex in the v0.11.1 plan review as a
show-stopper data-loss bug.

The new --migrate-only path:
  - loadConfig() reads the existing config (does NOT call saveConfig)
  - errors out with a clear "run gbrain init first" if no config exists
  - connects via the already-configured engine, calls engine.initSchema(),
    disconnects
  - --json emits structured success/error payloads

Everything downstream in the v0.11.1 migration chain (apply-migrations,
the stopgap bash script, the package.json postinstall hook) will invoke
this flag rather than bare gbrain init.

test/init-migrate-only.test.ts: 4 tests covering the no-config error
path, --json error payload shape, happy-path with a PGLite fixture
(verifies config.json content is byte-identical after the call — the
real invariant), and idempotent rerun.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(migrations): TS registry replaces filesystem migration scan

Context: Codex flagged that bun build --compile produces a self-contained
binary, and the existing findMigrationsDir() in upgrade.ts:145 walks
skills/migrations/v*.md on disk — which fails on a compiled install
because the markdown files aren't bundled. The plan's fix is a TS
registry: migrations are code, imported directly, visible to both source
installs and compiled binaries.

- src/commands/migrations/types.ts: shared Migration, OrchestratorOpts,
  OrchestratorResult types.
- src/commands/migrations/index.ts: exports the migrations[] array,
  getMigration(version), and compareVersions() (semver comparator).
  The feature_pitch data that lived in the MD file frontmatter now
  lives here as a code constant on each Migration, so runPostUpgrade's
  post-upgrade pitch printer can consume it without a filesystem read.
- src/commands/migrations/v0_11_0.ts: stub orchestrator + pitch. The
  full phase implementation lands in Lane C-1; for now the stub throws
  a clear "not yet implemented" so apply-migrations --list (Lane A-4)
  can still enumerate the migration.

test/migrations-registry.test.ts: 9 tests covering ascending-semver
ordering, feature_pitch shape invariants, getMigration lookup, and
compareVersions edge cases (equal / newer / older / single-digit
across major bumps).
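
The "single-digit across major bumps" edge case is the usual reason a semver comparator has to compare numerically rather than as strings. A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` versions with an optional leading `v` and no pre-release tags (`compareSemver` is a hypothetical name, not the registry's `compareVersions`):

```typescript
// Split "v0.11.1" into [0, 11, 1]; non-numeric parts fall back to 0.
function parseVersion(v: string): number[] {
  return v.replace(/^v/, "").split(".").map((part) => Number.parseInt(part, 10) || 0);
}

// Returns -1 when a < b, 0 when equal, 1 when a > b.
function compareSemver(a: string, b: string): -1 | 0 | 1 {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// A string comparison would put "0.9.0" after "0.10.2"; the numeric one does not.
const sortedVersions = ["0.11.1", "0.9.0", "0.10.2"].sort(compareSemver);
```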

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): gbrain apply-migrations — migration runner CLI

Reads ~/.gbrain/migrations/completed.jsonl, diffs against the TS migration
registry, runs pending orchestrators. Resumes status:"partial" entries
(the stopgap bash script writes these so v0.11.1 apply-migrations can
pick up where it left off). Idempotent: rerunning when up-to-date exits 0.

Flags:
  --list                    Show applied + partial + pending + future.
  --dry-run                 Print the plan; take no action.
  --yes / --non-interactive Skip prompts (used by runPostUpgrade + postinstall).
  --mode <a|p|o>            Preset minion_mode (bypasses the Phase C TTY prompt).
  --migration vX.Y.Z        Force-run one specific version.
  --host-dir <path>         Include $PWD in host-file walk (default is
                            $HOME/.claude + $HOME/.openclaw only).
  --no-autopilot-install    Skip Phase F.

Diff rule (Codex H9): apply when no status:"complete" entry exists AND
migration.version ≤ installed VERSION. The previously proposed rule was
"version > currentVersion", which would SKIP v0.11.0 when running v0.11.1;
regression test in apply-migrations.test.ts pins the correct semantics.
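
The diff rule and the four plan buckets can be sketched as a single classification function. This is an assumption-laden illustration, not the real buildPlan: `bucketFor`, `CompletedEntry`, and the order of the checks are hypothetical.

```typescript
type Bucket = "applied" | "partial" | "pending" | "skippedFuture";

interface CompletedEntry {
  version: string;
  status: "complete" | "partial";
}

// Numeric MAJOR.MINOR.PATCH comparison; negative when a < b.
function cmp(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const d = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (d !== 0) return d;
  }
  return 0;
}

function bucketFor(version: string, installed: string, log: CompletedEntry[]): Bucket {
  const entries = log.filter((e) => e.version === version);
  if (entries.some((e) => e.status === "complete")) return "applied";
  // Migrations newer than the installed binary are not runnable yet.
  if (cmp(version, installed) > 0) return "skippedFuture";
  // version <= installed and no complete entry: run it, or resume it if partial.
  return entries.some((e) => e.status === "partial") ? "partial" : "pending";
}
```

With the rejected "version > currentVersion" rule, the v0.11.0 migration would never run on a v0.11.1 binary; here `bucketFor("0.11.0", "0.11.1", [])` classifies it as pending.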

Registered in src/cli.ts CLI_ONLY Set; dispatched before connectEngine so
each phase owns its own engine/subprocess lifecycle (no double-connect
when the orchestrator shells out to init --migrate-only or jobs smoke).

test/apply-migrations.test.ts: 18 unit tests covering parseArgs for every
flag, indexCompleted/statusForVersion correctness (including stopgap-then-
complete transition), and buildPlan's four buckets (applied / partial /
pending / skippedFuture) with the Codex H9 regression pinned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(upgrade): runPostUpgrade tail-calls apply-migrations; postinstall hook

Closes the v0.11.0 mega-bug: migration skills never fired on upgrade.
`runPostUpgrade` now does two things:

  1. Cosmetic: prints feature_pitch headlines for migrations newer than
     the prior binary. Uses the TS registry (Codex K) instead of walking
     skills/migrations/*.md on disk — compiled binaries see the same list
     source installs do.
  2. Mechanical: invokes apply-migrations --yes --non-interactive in the
     same process so Phase F (autopilot install) doesn't hit a subprocess
     timeout wall. Catches + surfaces errors without failing the upgrade.

Also:
  - Drops the early-return on missing upgrade-state.json (Codex H8).
    runPostUpgrade now runs apply-migrations unconditionally; it's cheap
    when nothing is pending. This repairs every broken-v0.11.0 install on
    its next upgrade attempt.
  - Bumps the `gbrain post-upgrade` subprocess timeout in runUpgrade from
    30s → 300s (Codex H7). A v0.11.0→v0.11.1 migration that has to
    schema-init + smoke + prefs + host-rewrite + launchd-install exceeds
    30s trivially.
  - Removes now-dead findMigrationsDir + extractFeaturePitch helpers and
    their filesystem-reading imports (readdirSync, resolve).
  - src/cli.ts post-upgrade dispatch now awaits the async runPostUpgrade.

apply-migrations (Lane A-4):
  - First-install guard: loadConfig() check at the top. No brain
    configured = exit silently for --yes / --non-interactive (postinstall
    stays quiet on fresh `bun add gbrain`); explicit message on --list /
    --dry-run.

package.json:
  - New `postinstall` script: gbrain --version >/dev/null 2>&1 && gbrain
    apply-migrations --yes --non-interactive 2>/dev/null || true. The
    --version sanity check guards against a half-written binary (Codex
    review criticism). || true prevents `bun update gbrain` failure
    mid-upgrade.

Manual smoke verified: fresh $HOME with no config → apply-migrations
--yes silently exits 0; --dry-run prints the one-liner "No brain
configured... Nothing to migrate."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(commands): extract library-level Core functions that throw not exit

Codex architecture finding #5: reusing CLI entry-point functions as Minions
handler bodies is wrong. If a Minion invokes runExtract / runEmbed /
runBacklinks / runLint and the handler hits a process.exit(1), the ENTIRE
WORKER process dies — killing every other in-flight job. Handlers need
library-level APIs that throw, and the CLI stays a thin wrapper that
catches + exits.

Per-command shape:
  - runXxxCore(opts): throws on validation errors, returns structured
    result. Handler-safe.
  - runXxx(args): arg parser; calls Core; catches; process.exit(1) on
    thrown errors. CLI-safe.
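
The Core/wrapper split might look like the following sketch. The names (`lintCore`, `lintCli`, `LintOptions`, `LintSummary`) and the result shape are hypothetical stand-ins, not the real runLintCore signature; the injectable `exit` parameter is there only so the wrapper can be exercised without killing the process.

```typescript
interface LintOptions {
  target: string;
  fix?: boolean;
}

interface LintSummary {
  target: string;
  fixed: boolean;
}

// Library-level function: throws on bad input, so a caller can fail one job
// instead of losing the whole worker process.
function lintCore(opts: LintOptions): LintSummary {
  if (opts.target.trim() === "") throw new Error("lint: target is required");
  return { target: opts.target, fixed: opts.fix === true };
}

// Thin CLI wrapper: parses args, calls the Core function, and is the only place that exits.
function lintCli(
  args: string[],
  exit: (code: number) => void = (code) => process.exit(code),
): void {
  try {
    const result = lintCore({ target: args[0] ?? "", fix: args.includes("--fix") });
    console.log(JSON.stringify(result));
  } catch (err) {
    console.error(err instanceof Error ? err.message : String(err));
    exit(1);
  }
}
```

A job handler would call `lintCore` directly and let the thrown error mark that one job as failed.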

Shipped:
  - runExtractCore({ mode, dir, dryRun?, jsonMode? }) → ExtractResult
  - runEmbedCore({ slug? | slugs? | all? | stale? }) → void
  - runBacklinksCore({ action, dir, dryRun? }) → BacklinksResult
  - runLintCore({ target, fix?, dryRun? }) → LintResult

sync.ts is already correct — performSync throws; runSync wraps. No change.

import.ts deferred to v0.12.0 (its one process.exit fires only on a
missing dir arg; handlers always pass a dir, so worker-kill risk is
zero in practice). Noted in the plan's Out-of-scope.

Smoke verified: all four Core functions throw on invalid mode / missing
dir / not-found target instead of exiting the process.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(jobs): Tier 1 handlers + autopilot-cycle (the killer handler)

registerBuiltinHandlers now registers a handler for every operation
autopilot needs to dispatch via Minions, plus the single autopilot-cycle
handler the autopilot loop actually submits each interval.

Existing handlers (sync, embed, lint) rewired to call library-level Core
functions directly instead of the CLI wrappers. CLI wrappers call
process.exit(1) on validation errors; if a worker claimed a badly-formed
job, the WORKER PROCESS would die — killing every in-flight job. Cores
throw, so one bad job fails one job.

New handlers:
  - extract  → runExtractCore (mode: links|timeline|all, dir)
  - backlinks → runBacklinksCore (action: check|fix, dir)
  - autopilot-cycle → THE killer handler. Runs sync → extract → embed →
    backlinks inline. Each step wrapped in try/catch; returns
    { partial: true, failed_steps: [...] } when any step fails. Does NOT
    throw on partial failure — that would trigger Minion retry, and an
    intermittent extract bug would block every future cycle. Replaces
    the 4-job parent-child DAG proposed in early plan drafts (Codex
    H3/H4: parent/child is NOT a depends_on primitive in Minions).
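
The "fail one step, not the whole cycle" behaviour of autopilot-cycle can be sketched as below. This is a simplified illustration: `runCycle` and `StepFn` are hypothetical, and the real handler's result carries more detail than this `steps` map.

```typescript
type StepFn = () => Promise<void>;

interface CycleResult {
  partial: boolean;
  failed_steps: string[];
  steps: Record<string, "ok" | "failed">;
}

// Runs every step in order. A failing step is recorded and the loop continues,
// so the function resolves even when some steps fail.
async function runCycle(steps: Array<[string, StepFn]>): Promise<CycleResult> {
  const result: CycleResult = { partial: false, failed_steps: [], steps: {} };
  for (const [name, run] of steps) {
    try {
      await run();
      result.steps[name] = "ok";
    } catch {
      result.steps[name] = "failed";
      result.failed_steps.push(name);
      result.partial = true;
    }
  }
  return result;
}
```

Because the function resolves rather than rejects, a queue that retries on thrown errors would not retry the cycle; the next interval simply submits a fresh one.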

import.ts handler still uses the CLI wrapper (runImport) — import's one
process.exit fires only on a missing dir arg and the handler always
passes a dir; Core extraction deferred to v0.12.0 when Tier 2 refactors
happen.

registerBuiltinHandlers promoted from private to exported for testability.

test/handlers.test.ts: 4 tests. Asserts every expected handler name
registers. Asserts autopilot-cycle against a nonexistent repo returns
{ partial: true, failed_steps: ['sync', 'extract', 'backlinks'] } — does
NOT throw. Asserts autopilot-cycle against an empty (but real) git repo
returns a result with a steps map, never throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(autopilot): Minions dispatch + worker spawn supervisor + async shutdown

Autopilot now dispatches each cycle as a single `autopilot-cycle` Minion
job (with idempotency_key on the cycle slot) instead of running steps
inline. A forked `gbrain jobs work` child drains the queue durably,
supervised by autopilot. The user runs ONE install step
(`gbrain autopilot --install`) and gets sync + extract + embed + backlinks
+ durable job processing, with no separate worker daemon to manage.

Mode selection:
  - minion_mode=always OR pain_triggered (default), engine=postgres →
    Minions dispatch. Spawn child, submit autopilot-cycle each interval.
  - minion_mode=off, OR engine=pglite, OR `--inline` flag → run steps
    inline in-process, same as pre-v0.11.1. PGLite has an exclusive file
    lock that blocks a second worker process, so the inline path is the
    only path that works there.

Worker supervision:
  - spawn(resolveGbrainCliPath(), ['jobs', 'work'], { stdio: 'inherit' }).
    stdio:'inherit' avoids pipe-buffer blocking (Codex architecture #2).
  - On worker exit: 10s backoff + restart. Crash counter caps at 5 →
    autopilot stops with a clear error.
  - resolveGbrainCliPath() prefers argv[1] (when it is cli.ts or a gbrain binary), then
    process.execPath (compiled binary suffix check), then `which gbrain`
    (installed to $PATH). NEVER blindly uses process.execPath, which on
    source installs is the Bun runtime, not `gbrain` (Codex architecture
    #1).
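
The resolution order can be sketched as a pure function over its three inputs. This is a guess at the shape, not the real resolveGbrainCliPath: `pickCliPath`, `CliPathInputs`, and the exact suffix checks are assumptions, and the caller is assumed to have already looked up `which gbrain`.

```typescript
interface CliPathInputs {
  argv1?: string;       // process.argv[1], if any
  execPath: string;     // process.execPath
  whichGbrain?: string; // result of looking gbrain up on $PATH, if found
}

function pickCliPath({ argv1, execPath, whichGbrain }: CliPathInputs): string {
  // 1. argv[1] is trusted only when it clearly points at the CLI.
  if (argv1 && (argv1.endsWith("cli.ts") || argv1.endsWith("/gbrain"))) return argv1;
  // 2. execPath is used only when it is the compiled gbrain binary, never the Bun runtime.
  if (execPath.endsWith("/gbrain")) return execPath;
  // 3. Fall back to whatever is installed on $PATH.
  if (whichGbrain) return whichGbrain;
  throw new Error("could not resolve the gbrain CLI path");
}
```

On a source install, `execPath` is the Bun runtime, so the second check fails and the function never hands back the runtime as if it were the CLI.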

Shutdown:
  - Async SIGTERM/SIGINT handler: sends SIGTERM to worker, awaits its
    exit for up to 35s (the worker's own drain is 30s; we add buffer for
    signal-delivery latency), then SIGKILL if still alive.
  - Drops the old `process.on('exit')` lock-cleanup handler — its
    callback runs synchronously and can't wait for the worker drain.
    Lock file cleanup moved inside the async shutdown.
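
The SIGTERM, wait, then SIGKILL sequence might be sketched as follows. `ChildLike` is a hypothetical minimal interface (not Node's ChildProcess type), `stopWorker` is a hypothetical name, and the grace period is a parameter here rather than the 35s from the commit.

```typescript
interface ChildLike {
  kill(signal: "SIGTERM" | "SIGKILL"): void;
  exited: Promise<void>; // resolves when the child process has exited
}

// A cancellable timer, so a fast exit does not leave a pending timeout behind.
function delay(ms: number): { promise: Promise<"timeout">; cancel: () => void } {
  let cancel = (): void => {};
  const promise = new Promise<"timeout">((resolve) => {
    const t = setTimeout(() => resolve("timeout"), ms);
    cancel = () => clearTimeout(t);
  });
  return { promise, cancel };
}

async function stopWorker(child: ChildLike, graceMs: number): Promise<"drained" | "killed"> {
  child.kill("SIGTERM");
  const timer = delay(graceMs);
  const outcome = await Promise.race([
    child.exited.then(() => "exited" as const),
    timer.promise,
  ]);
  timer.cancel();
  if (outcome === "timeout") {
    child.kill("SIGKILL"); // the drain window elapsed; force the worker down
    return "killed";
  }
  return "drained";
}
```

This waiting is why the cleanup cannot live in a `process.on('exit')` callback: that callback must finish synchronously, while the drain needs an awaited promise.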

Lock-file mtime refresh every cycle (Codex C) so a long-lived autopilot
doesn't get declared "stale" by the next cron-fired invocation after 10
minutes.

Inline fallback path calls the new Core fns (runExtractCore, runEmbedCore)
instead of the CLI wrappers. That way a bad arg from inside the loop
can't process.exit() the autopilot itself (matches Codex #5).

test/autopilot-resolve-cli.test.ts: 3 tests covering argv[1]-as-gbrain,
argv[1]-as-cli.ts, and graceful error when no path resolves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(autopilot): env-aware install + OpenClaw bootstrap injection

Expand installDaemon from 2 targets (macOS launchd, Linux crontab) to 4:

  - macos              → launchd plist (unchanged)
  - linux-systemd      → ~/.config/systemd/user/gbrain-autopilot.service
                         with Restart=on-failure, RestartSec=30, and an
                         is-system-running probe to confirm the user bus
                         actually works (Codex architecture #7 hardened —
                         the naive /run/systemd/system existence check was
                         a false-positive magnet)
  - ephemeral-container → detects RENDER / RAILWAY_ENVIRONMENT /
                          FLY_APP_NAME / /.dockerenv. Crontab is unreliable
                          here (wiped on deploy), so we write
                          ~/.gbrain/start-autopilot.sh and tell the user
                          to source it from their agent's bootstrap
  - linux-cron         → existing crontab path (unchanged)
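
The four-way detection could be sketched as a pure function over a probe result. The precedence used here (macOS, then ephemeral container, then systemd, then cron) is an assumption, as are the `Probe` shape and the `detectTarget` name; the real detectInstallTarget() may order or probe things differently.

```typescript
type InstallTarget = "macos" | "linux-systemd" | "ephemeral-container" | "linux-cron";

interface Probe {
  platform: string;                        // e.g. process.platform
  env: Record<string, string | undefined>; // e.g. process.env
  hasDockerEnv: boolean;                   // whether /.dockerenv exists
  systemdUserBusWorks: boolean;            // result of the is-system-running probe
}

function detectTarget(p: Probe): InstallTarget {
  if (p.platform === "darwin") return "macos";
  const ephemeral =
    Boolean(p.env["RENDER"] || p.env["RAILWAY_ENVIRONMENT"] || p.env["FLY_APP_NAME"]) ||
    p.hasDockerEnv;
  if (ephemeral) return "ephemeral-container";
  if (p.systemdUserBusWorks) return "linux-systemd";
  return "linux-cron";
}
```

Taking the probe results as data keeps the branching testable without a real systemd or container environment.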

detectInstallTarget() + --target flag for explicit override. Also:
  - --inject-bootstrap / --no-inject control OpenClaw ensure-services.sh
    auto-injection. Default is ON when OpenClaw is detected (OPENCLAW_HOME
    env var, openclaw.json in CWD or $HOME, or an ensure-services.sh
    found). Injection adds ONE line with a `# gbrain:autopilot v0.11.0`
    marker and writes .bak.<ISO-timestamp> before touching the file.
    Idempotent — the marker check prevents double injection.
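
Marker-based idempotent injection can be sketched as a pure string transform. The marker text comes from the commit message; the injected command, the `injectLine` name, and the placement at the end of the file are assumptions, and the `.bak` backup step is left out.

```typescript
const MARKER = "# gbrain:autopilot v0.11.0";

// Returns the new file content plus whether anything changed.
function injectLine(content: string, command: string): { content: string; changed: boolean } {
  // The marker check is what makes a second run a no-op.
  if (content.includes(MARKER)) return { content, changed: false };
  const base = content === "" || content.endsWith("\n") ? content : content + "\n";
  return { content: `${base}${command} ${MARKER}\n`, changed: true };
}

// Usage with a hypothetical bootstrap line.
const firstPass = injectLine("#!/bin/sh\necho start", ". ~/.gbrain/start-autopilot.sh");
const secondPass = injectLine(firstPass.content, ". ~/.gbrain/start-autopilot.sh");
```

A caller would write the backup and the new content only when `changed` is true.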

uninstallDaemon mirrors all four targets. A user can now run
`gbrain autopilot --uninstall` after moving hosts (macOS laptop → Linux
server) and the uninstall will find + remove every artifact.

writeWrapperScript now uses resolveGbrainCliPath() instead of blindly
baking process.execPath into the wrapper script — on source installs
that path is the Bun runtime, not gbrain (Codex architecture #1 fix
propagated to the install path too).

test/autopilot-install.test.ts: 4 tests covering detectInstallTarget's
platform + env-var branches. Deeper E2E coverage (systemd unit file
contents, ephemeral start-script contents + exec bit, OpenClaw marker
injection + .bak) lives in Task 14's E2E fixture test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(migrations): v0.11.0 orchestrator — phases A through G, full implementation

Replaces the stub from commit de027ce. The orchestrator runs all seven
phases of the v0.11.0 Minions adoption migration idempotently, resumable
from any prior status:"partial" run (the stopgap bash script writes
those).

Phases:
  A. Schema  — `gbrain init --migrate-only` (NEVER bare `gbrain init`,
               which defaults to PGLite and clobbers existing configs —
               Codex H1 show-stopper).
  B. Smoke   — `gbrain jobs smoke`. Abort loudly on non-zero.
  C. Mode    — --mode flag wins. Preserved from prefs on resume. Non-TTY
               or --yes defaults to pain_triggered with explicit print.
               Interactive: numbered 1/2/3 menu via shared promptLine.
  D. Prefs   — savePreferences({minion_mode, set_at, set_in_version}).
  E. Host    — AGENTS.md marker injection + cron manifest rewrites. For
               cron entries whose skill matches a gbrain builtin
               (sync/embed/lint/import/extract/backlinks/autopilot-cycle)
               rewrites kind:agentTurn → kind:shell with a
               gbrain jobs submit command. PGLite branch keeps --follow
               (inline execution, the only path that works without a
               worker daemon); Postgres branch drops --follow + adds
               --idempotency-key ${handler}:${slot} so long cron jobs
               don't stack up (same Codex fix as the autopilot-cycle
               dispatch). For non-builtin handlers (host-specific, like
               ea-inbox-sweep, frameio-scan, x-dm-triage) emits a
               structured TODO row to
               ~/.gbrain/migrations/pending-host-work.jsonl so the host
               agent can walk through plugin-contract work per
               skills/migrations/v0.11.0.md.
  F. Install — `gbrain autopilot --install --yes`. Best-effort (failure
               doesn't abort; user can run manually).
  G. Record  — append to completed.jsonl. status:"complete" unless
               pending_host_work > 0, in which case status:"partial" +
               apply_migrations_pending: true.

Safety guards (Codex code-quality tension #3: strict-skip, no rollback):
  - Scope: $HOME/.claude + $HOME/.openclaw only by default. --host-dir
    must be explicit to include $PWD or any other path.
  - Symlink escape: SKIP if the resolved target leaves the scoped root.
  - >1 MB files: SKIP with warning.
  - Permission denied: SKIP with warning; other files continue.
  - Malformed JSON manifest: SKIP with parse error logged; continue.
  - mtime re-check right before write: bail the file if changed between
    read + write; other files continue.
  - Every edit writes a .bak.<ISO-timestamp> sibling first (second-
    precision so two same-day runs don't collide).
  - Idempotency: `_gbrain_migrated_by: "v0.11.0"` JSON property marker
    on each rewritten cron entry (JSON can't have comments — Codex G);
    AGENTS.md marker `<!-- gbrain:subagent-routing v0.11.0 -->`.
  - TODO dedupe: JSONL appends deduped by (handler, manifest_path) so
    reruns don't grow the file.
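
The symlink-escape guard reduces to a path containment check once the symlink has been resolved. A minimal sketch, assuming POSIX paths and that the caller has already resolved the candidate with something like `fs.realpathSync` (`isInsideRoot` is a hypothetical name):

```typescript
import { posix } from "node:path";

// True when `resolved` is the root itself or lies underneath it.
function isInsideRoot(root: string, resolved: string): boolean {
  const rel = posix.relative(root, resolved);
  const escapes = rel === ".." || rel.startsWith("../") || posix.isAbsolute(rel);
  return !escapes;
}
```

Comparing via `relative` avoids the classic prefix bug in which `/home/u/.claude-evil` passes a `startsWith("/home/u/.claude")` test.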

Post-run summary: when pending_host_work > 0, prints a one-liner
pointing the user at the JSONL path + the v0.11.0 skill file. The skill
(Lane C-3 / C-4) is the host-agent instruction manual.

test/migrations-v0_11_0.test.ts: 18 tests covering:
  - AGENTS.md injection: happy path, .bak creation, idempotent rerun,
    --dry-run no-op, symlink-escape SKIP, >1MB SKIP.
  - Cron rewrite: builtin handlers rewrite to shell+gbrain jobs submit,
    non-builtins emit JSONL TODOs without touching the manifest, mixed
    manifests get both treatments in one pass, idempotent rerun, TODO
    dedupe, malformed JSON SKIP, no-entries-array SKIP, --dry-run no-op.
  - findAgentsMdFiles + findCronManifests: scoped walk to $HOME/.claude +
    $HOME/.openclaw, --host-dir opt-in for $PWD.
  - BUILTIN_HANDLERS frozen at the canonical 7 names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(skill): port skillify from Wintermute, pair with check-resolvable

Skillify is the "meta skill": turn any raw feature or script into a
properly-skilled, tested, resolvable, evaled unit of agent-visible
capability. Proven in production on Wintermute; paired with gbrain's
existing `check-resolvable` it becomes a user-controllable equivalent of
Hermes' auto-skill-creation — you decide when and what, the tooling
keeps the checklist honest.

Shipped:
  - skills/skillify/SKILL.md — ported from ~/git/wintermute/workspace/
    skills/skillify/SKILL.md. Genericized:
      * /data/.openclaw/workspace → ${PROJECT_ROOT} (runtime-detected).
      * services/voice-agent/__tests__/ → test/ (detected from repo).
      * Manual `grep skills/... AGENTS.md` replaced with a reference to
        `gbrain check-resolvable`, which does reachability + MECE + DRY
        + gap detection properly instead of grep-matching a path string.
  - scripts/skillify-check.ts — ported from
    ~/git/wintermute/workspace/scripts/skillify-check.mjs. Preserves the
    --recent flag and --json output shape. Detects project root via
    package.json walkup; detects test dir (test/ → __tests__/ → tests/
    → spec/). Runs the 10-item checklist per target and exits non-zero
    if any required item is missing.
  - test/skillify-check.test.ts — 4 CLI tests: happy-path against
    publish.ts (known-skilled), --json shape + schema, --recent smoke,
    bogus-target exit code.
  - skills/RESOLVER.md — adds the trigger row ("Skillify this", "is
    this a skill?", "make this proper") → skills/skillify/SKILL.md.
  - skills/manifest.json — adds the skillify entry so the conformance
    test passes.

Why the pair:
  * Hermes auto-creates skills in the background. Fine until you don't
    know what the agent shipped — checklists decay silently.
  * gbrain ships the same capability as two user-controlled tools:
    /skillify builds the checklist, gbrain check-resolvable validates
    reachability + MECE + DRY across the whole skill tree.
  * Human keeps judgment. Tooling keeps the checklist honest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(v0.11.1): cron-via-minions convention, plugin-handlers guide, minions-fix, skill updates

New reference docs:
  - skills/conventions/cron-via-minions.md — the rewrite convention for
    cron manifests. Shows the Postgres (fire-and-forget + idempotency-
    key) vs PGLite (--follow inline) branch; explains why builtin-only
    auto-rewrite is safe + how host-specific handlers get the plugin
    contract.
  - docs/guides/plugin-handlers.md — the plugin contract for host-
    specific Minion handlers. Code-level registration via import +
    worker.register(), not a data file (Codex D: handlers.json was an
    RCE surface). Concrete TypeScript skeleton + handler contract
    (ctx.data, ctx.signal, ctx.inbox) + full migration flow from TODO
    JSONL to a rewritten cron entry.
  - docs/guides/minions-fix.md — user-facing troubleshooting for
    half-migrated v0.11.0 installs. Paste-one-liner for the stopgap,
    gbrain apply-migrations path for v0.11.1+, verification commands,
    failure-mode recipes.

Rewrites + updates:
  - skills/migrations/v0.11.0.md — body restored as the host-agent
    instruction manual. Audience is the host agent reading
    ~/.gbrain/migrations/pending-host-work.jsonl after the CLI
    orchestrator has done the mechanical phases. Walks each TODO type
    through the 10-item skillify checklist (plugin contract, ship
    bootstrap, unit tests, integration tests, LLM evals, resolver
    trigger, trigger eval, E2E smoke, brain filing, check-resolvable).
    Reverses the earlier "delete the body" decision (1B) because the
    body serves a different audience now — host-agent, not CLI
    documentation.
  - skills/cron-scheduler/SKILL.md — Phase 4 ("Register with host
    scheduler") now references cron-via-minions + plugin-handlers.
  - skills/maintain/SKILL.md — new "Fix a half-migrated install"
    section with the apply-migrations recipe.
  - skills/setup/SKILL.md — new Phase C.5 "One-step autopilot +
    Minions install (v0.11.1+)" explaining the four install targets
    + the OpenClaw auto-injection default.
  - docs/GBRAIN_SKILLPACK.md — Operations section adds the three new
    guides + the subagent-routing and cron-routing SKILLPACK notes
    (v0.11.0+).

All 167 related tests (conformance + resolver + skillify-check + v0_11_0
orchestrator) stay green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(v0.11.1): stopgap script + CLAUDE.md directive + README + CHANGELOG + version bump

scripts/fix-v0.11.0.sh — the paste-command for broken-v0.11.0 installs.
Released on the v0.11.1 tag so:
  curl -fsSL https://raw.githubusercontent.com/garrytan/gbrain/v0.11.1/scripts/fix-v0.11.0.sh | bash
always works (master branch could be renamed). 8 steps: schema apply,
smoke, mode prompt (non-TTY defaults to pain_triggered), atomic write of
preferences.json (0o600), append completed.jsonl with status:"partial"
and apply_migrations_pending:true so the v0.11.1 apply-migrations run
resumes correctly (does NOT poison the permanent migration path —
Codex H2 avoidance), AGENTS.md + cron/jobs.json detection with guidance
printed as text only (never auto-edits from a curl-piped script), and a
closing line telling the user to run `gbrain autopilot --install` as the
one-stop finisher.

CLAUDE.md — new "Migration is canonical, not advisory" section pinning
the design principle. Any host-repo change (AGENTS.md, cron manifests,
launchctl units) is GBrain's responsibility via the migration; the
exception is host-specific handler registration, which goes via the
code-level plugin contract in docs/guides/plugin-handlers.md.

README.md — new sections:
  - "v0.11.0 migration didn't fire on your upgrade?" with both repair
    paths (v0.11.1 binary and pre-v0.11.1 stopgap).
  - "Skillify + check-resolvable: user-controllable auto-skill-creation"
    explaining why the user-controlled pair beats Hermes-style auto
    generation. Includes the scripts/skillify-check.ts invocation.

CHANGELOG.md — v0.11.1 entry (per CLAUDE.md voice: lead with what the
user can now do that they couldn't before; frame as benefits, not files
changed). Covers: mega-bug fix + apply-migrations + postinstall +
stopgap, autopilot-supervises-worker + single-install-step + env-aware
targets, Core fn extraction so handlers don't kill workers, skillify +
check-resolvable pair, host-agnostic plugin contract replacing
handlers.json (RCE concern), gbrain init --migrate-only, TS migration
registry + H8/H9 diff-rule fixes, CLAUDE.md directive. All Codex hard
blockers (H1, H3/H4, H5, H6, H7, H8, H9, K) + architecture issues
(#1/#2/#4/#5/#7) resolved.

package.json — version bump 0.11.0 → 0.11.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): migration-flow E2E against live Postgres + Bun env quirk fix

Ships test/e2e/migration-flow.test.ts — the end-to-end integration test
for the v0.11.0 orchestrator. Spins up against a live Postgres (gated
on DATABASE_URL per CLAUDE.md lifecycle) and exercises four scenarios:

  - Fresh install: schema apply (Phase A via `gbrain init --migrate-only`)
    → smoke (Phase B) → mode resolution (C) → prefs (D) → host rewrite
    (E, empty fixture) → record (G). Asserts preferences.json exists with
    0o600, completed.jsonl has a v0.11.0 entry, autopilot install was
    skipped per --no-autopilot-install.
  - Idempotent rerun: second orchestrator invocation on a completed
    install doesn't blow up; mode stays stable.
  - Host rewrite mixed manifest: 4-entry cron/jobs.json with 2 gbrain-
    builtin handlers (sync, embed) + 2 non-builtin (ea-inbox-sweep,
    morning-briefing). Asserts builtins rewrite to `gbrain jobs submit`
    kind:shell, non-builtins are LEFT on kind:agentTurn, and 2 JSONL
    TODOs are emitted with correct shape. AGENTS.md gets the marker
    injected. Status is "partial" because pending-host-work > 0.
  - Resumable: stopgap writes a partial completed.jsonl row first;
    orchestrator re-runs successfully against it and appends a new
    post-orchestrator entry. 1 partial + 1 complete = 2 rows total.

Critical fix surfaced by the E2E: src/commands/migrations/v0_11_0.ts's
three execSync calls (gbrain init --migrate-only, gbrain jobs smoke,
gbrain autopilot --install) now explicitly pass `env: process.env`.
Bun's execSync default does NOT propagate post-start `process.env.PATH`
mutations to subprocesses — only the initial PATH snapshot. Without the
explicit env, any user-side env tweak (e.g. setting GBRAIN_DATABASE_URL
in a script before calling the orchestrator) would be invisible to the
orchestrator's subprocesses. This is also the reason the E2E needs a
PATH shim installed at module-load time to expose the `gbrain` command.

test/init-migrate-only.test.ts: subprocess env now strips DATABASE_URL
and GBRAIN_DATABASE_URL. The "no config" error-path tests need
loadConfig() to return null, which it won't if the env-var fallback at
src/core/config.ts:30 fires. Before this fix, running the unit tests
with DATABASE_URL set (e.g. during an E2E run) caused false failures
because `gbrain init --migrate-only` saw the env var and succeeded.

Full test totals with live Postgres: 1265 pass, 0 fail, 3497 expect
calls, 67 files, ~95s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump VERSION file to 0.11.1

Commit 5c4cf1d bumped package.json version to 0.11.1 but missed the
root VERSION file. src/version.ts reads from package.json so
`gbrain --version` prints 0.11.1 correctly, but any tool or script
that reads the VERSION file directly (like /ship's idempotency check)
saw the stale 0.11.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(v0.11.1): doctor self-heal check + skillpack-check command for cron health reports

Closes the discoverability hole from the v0.11.0 mega-bug: once a user is
on v0.11.1 (or later), every `gbrain doctor` invocation immediately
surfaces a half-migrated state, and `gbrain skillpack-check` gives host
agents (Wintermute's morning-briefing, any OpenClaw cron) a single
exit-coded JSON pipe to check from their own skills.

gbrain doctor — two new checks:
  1. Filesystem-only (fires on every `doctor` invocation, even --fast):
     if `~/.gbrain/migrations/completed.jsonl` has any status:"partial"
     entry with no matching status:"complete" for the same version, print
     `MINIONS HALF-INSTALLED (partial migration: vX.Y.Z). Run: gbrain
     apply-migrations --yes`. Typical cause is the stopgap wrote a
     partial record but nobody ran `apply-migrations` afterward.
  2. DB-path: if schema version is v7+ (Minions present) AND
     `~/.gbrain/preferences.json` is missing, print the same banner.
     Catches installs that never ran the stopgap or apply-migrations at
     all — the classic v0.11.0 "upgrade landed, migration never fired"
     state.
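
The filesystem-only check amounts to "partial entries with no matching complete entry". A sketch under the assumption that each JSONL row carries at least `version` and `status` string fields (`parseJsonl` and `halfMigratedVersions` are hypothetical names):

```typescript
interface MigrationRecord {
  version: string;
  status: string;
}

// Parse JSONL, silently skipping blank or malformed lines.
function parseJsonl(text: string): MigrationRecord[] {
  const out: MigrationRecord[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const row: unknown = JSON.parse(line);
      if (typeof row === "object" && row !== null && "version" in row && "status" in row) {
        const { version, status } = row as { version: unknown; status: unknown };
        if (typeof version === "string" && typeof status === "string") out.push({ version, status });
      }
    } catch {
      // A corrupt line should not hide the remaining entries.
    }
  }
  return out;
}

// Versions that have a partial entry but no complete entry.
function halfMigratedVersions(text: string): string[] {
  const rows = parseJsonl(text);
  const complete = new Set(rows.filter((r) => r.status === "complete").map((r) => r.version));
  const partial = rows.filter((r) => r.status === "partial").map((r) => r.version);
  return Array.from(new Set(partial)).filter((v) => !complete.has(v));
}
```

An empty result means the check stays quiet; any returned version would be named in the banner.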

Both checks report status:"fail", so doctor exits non-zero when either fires.
Test `test/doctor-minions-check.test.ts` pins the five branches
(partial present → FAIL, partial+complete → quiet, no-jsonl → quiet,
multiple versions named correctly, human-readable banner contains the
exact "MINIONS HALF-INSTALLED" phrase Wintermute's cron can grep for).

gbrain skillpack-check — new command + skill:
  - `src/commands/skillpack-check.ts` wraps `doctor --fast --json` +
    `apply-migrations --list` into one JSON report with `{healthy,
    summary, actions[], doctor, migrations}`. Exit 0 on healthy, 1 on
    action-needed, 2 on determine-failure. `--quiet` flag for cron
    pipes that want exit-code-only behavior.
  - `actions[]` is the remediation list. Doctor messages of the form
    `... Run: <cmd>` get their command extracted (regex fixed to match
    the full remainder of the line, not just the first word). Pending
    or partial migrations push `gbrain apply-migrations --yes` to the
    front of actions[].
  - `gbrainSpawn()` helper resolves the gbrain invocation correctly on
    compiled binary installs (`argv[1] = /usr/local/bin/gbrain`) AND
    source installs (`argv[1] = src/cli.ts`, prefix with `bun run`).
    Same Codex #1 fix pattern as autopilot's resolveGbrainCliPath.
  - `skills/skillpack-check/SKILL.md` teaches agents when to run it,
    what to do with the output, and anti-patterns (don't run without
    --quiet in a cron that emails; don't ignore exit 2).
  - Registered in skills/RESOLVER.md and skills/manifest.json.
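
The `actions[]` extraction and the exit-code contract above could look roughly like this. It is a sketch under stated assumptions, not the contents of `src/commands/skillpack-check.ts`: the function names (`extractAction`, `buildActions`, `exitCodeFor`) and the de-duplication of the migration command are illustrative choices; the `Run: <cmd>` convention, the front-of-list rule, and the 0/1/2 exit codes are taken from the list above.

```typescript
// Capture everything after "Run:" up to the end of that line,
// not just the first word of the command.
const RUN_HINT = /Run:\s*(.+)$/m;

// Pull the remediation command out of a doctor message, if it has one.
function extractAction(message: string): string | null {
  const match = RUN_HINT.exec(message);
  return match ? match[1].trim() : null;
}

// Assemble the remediation list. Pending or partial migrations put
// `gbrain apply-migrations --yes` first (and only once, an assumption here).
function buildActions(doctorMessages: string[], migrationsPending: boolean): string[] {
  const actions: string[] = [];
  for (const msg of doctorMessages) {
    const cmd = extractAction(msg);
    if (cmd !== null) actions.push(cmd);
  }
  if (!migrationsPending) return actions;
  const fix = "gbrain apply-migrations --yes";
  return [fix].concat(actions.filter((a) => a !== fix));
}

// Exit 0 on healthy, 1 on action-needed, 2 when health could not be determined.
function exitCodeFor(actions: string[], determineFailed: boolean): number {
  if (determineFailed) return 2;
  return actions.length > 0 ? 1 : 0;
}
```

Keeping the exit code a pure function of the action list means a `--quiet` cron caller and a JSON consumer see the same verdict.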

Test `test/skillpack-check.test.ts` (5 tests) covers healthy fresh
install, half-migrated exit-1 with apply-migrations in actions[],
--quiet suppresses stdout in both states, --help prints usage, summary
includes top action when multiple are present.

1192 unit tests pass (+15 new). The 38 failing tests are all
DATABASE_URL E2Es — same pre-existing pattern, unchanged by this
commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doc(v0.11.1): reframe README + minions-fix — v0.11.0 was never released

v0.11.0 was cut but never released publicly. v0.11.1 is the first
public Minions ship, and fixes the upgrade-migration mega-bug so it
self-heals on every future `gbrain upgrade` + `bun update gbrain`.
The README was wrongly framing the fix as a retrospective for v0.11.0
users — none exist, so remove it.

README changes:
  - Delete the "v0.11.0 migration didn't fire on your upgrade?" section.
    Replace with "Health check and self-heal": the `gbrain doctor`,
    `gbrain skillpack-check --quiet`, and `gbrain skillpack-check | jq`
    recipes that ship in v0.11.1. Still links to docs/guides/minions-fix.md
    for deeper troubleshooting.
  - Promote the production benchmark to top billing. The previous section
    led with the lab benchmark (same LLM, localhost) and buried the
    production data point as a single follow-up sentence. Real deployment
    numbers are the stronger signal:
      * 753ms vs >10s gateway timeout (sub-agent couldn't even spawn)
      * $0.00 vs ~$0.03 per run
      * 100% vs 0% success rate under 19-cron production load
      * 36-month tweet backfill: 19,240 tweets, ~15 min, $0.00
    Lab numbers stay (separate table, labeled "controlled environment")
    so readers can see both layers.
  - Add the "The routing rule" closer: Deterministic → Minions, Judgment
    → Sub-agents. This is the clearest framing in the production
    benchmark doc and belongs in the README so readers leave with the
    right mental model. `minion_mode: pain_triggered` automates it.

docs/guides/minions-fix.md rewrite:
  - Reframe as: v0.11.0 never released, v0.11.1 is the first ship,
    `gbrain apply-migrations --yes` is canonical. Stopgap stays
    documented for pre-v0.11.1 branch builds (e.g. Wintermute's
    minions-jobs checkout before v0.11.1 tags).
  - Add the detection + verification commands (doctor + skillpack-check)
    at the top.
  - Cross-reference skills/skillpack-check/SKILL.md as the agent-facing
    health-check pattern.

Zero lingering "v0.11.0 released" references in README or minions-fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(doctor): remove "schema v7+ no prefs → FAIL" check (too aggressive)

CI failure in Tier 1 Mechanical E2E:
  (fail) E2E: Doctor Command > gbrain doctor exits 0 on healthy DB

Root cause: the doctor half-migration detection added two checks. The
second check (`schema v7+ AND ~/.gbrain/preferences.json missing →
minions_config FAIL`) was too aggressive. It treated a valid fresh-
install state as broken.

`gbrain init` against Postgres applies schema v7 but doesn't write
preferences.json — that's the migration orchestrator's Phase D, which
only runs via `apply-migrations`. Between `init` finishing and the user
running `apply-migrations`, the install is legitimately in a
"schema-applied, no prefs" state. Doctor was exiting 1 on this valid
state, breaking the pre-existing CI test that inits and doctors a
healthy DB.

Fix: drop the check. The filesystem check (step 3 — partial-completed
without a matching complete) is sufficient signal for genuine half-
migration. Added a regression test pinning the exact CI scenario: no
completed.jsonl present, no preferences.json, doctor must not fail any
minions_* check.

Also removes the now-unused `preferencesPaths` import.

Verified against live Postgres: CI-equivalent `gbrain doctor` + `gbrain
doctor --json` both pass. Full suite: 1281/1281 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doc(readme): Minions section — lead with the story, compress the rest

The previous section opened with "six daily pains" as a numbered list
before the hook, buried the production numbers halfway down, and had
a table explaining how each pain gets fixed. Fine for a spec doc;
wrong for a README that needs to land the impact fast.

Rewrite:
  - Lead with "your sub-agents won't drop work anymore" — the reason
    a reader is here.
  - Production numbers promoted, framed as a story: "Here's my
    personal OpenClaw deployment: one Render container, Supabase
    Postgres holding a 45,000-page brain, 19 cron jobs firing on
    schedule, the X Enterprise API on the wire..." Gives the reader
    the setup before the punchline.
  - The routing rule (deterministic → Minions, judgment → sub-agents)
    survives unchanged. It's the clearest framing in the whole section.
  - Lose the "how each pain gets fixed" table. Compress the six pains
    + their fixes into one paragraph that names the primitives by
    name (max_children, timeout_ms, child_done inbox, cascade cancel,
    idempotency keys, attachment validation). Readers who want depth
    click through to skills/minion-orchestrator/SKILL.md.
  - Close with "not incrementally better — categorically different"
    and the three headline numbers.
  - Drop the separate Lab Numbers table; the production numbers are
    stronger and the lab data is one click away via the link.

Lines: 75 → 42. Same signal, less scroll.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doc: scrub X Enterprise API + @garrytan references from user-facing docs

User feedback: shouldn't name the specific enterprise-tier API product
or the account in the README or benchmark docs. Genericize:

  - "X Enterprise API on the wire" → drop entirely; the 19-cron load
    story carries the setup without naming the vendor
  - "X Enterprise API ($50K/mo firehose)" → "external API"
  - "@garrytan tweets" → "my social posts"
  - "Pull ~100 @garrytan tweets" → "Pull ~100 of my social posts"
  - "X Enterprise API (full-archive)" env var comment → "external API
    bearer token"

Scope:
  - README.md — the Minions production story line + scaling callout
  - docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md
  - docs/benchmarks/2026-04-18-tweet-ingestion.md

Plain "X API" references in the tweet-ingestion methodology stay —
those describe which public HTTP endpoint was called, not the
enterprise-tier product. Benchmark doc filenames (tweet-ingestion.md)
stay to preserve inbound links; content is genericized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doc(readme): Skillify section — match Minions energy, land the category shift

The previous section was competent but undersold what skillify actually
is. Rewrite matches the Minions section's shape: lead with the hook,
tell the story, land the punchline.

Key changes:
  - Title: "your skills tree stops being a black box." Names the thing
    skillify actually solves.
  - Open with the problem: Hermes auto-creates skills as a background
    behavior. Six months later you have an opaque pile nobody's read
    or tested. Make the liability concrete.
  - Promote the 10 items by name (SKILL.md + script + unit tests +
    integration tests + LLM evals + resolver trigger + trigger eval +
    E2E + brain filing + check-resolvable audit). Showing the list
    makes the scope of the unlock visible.
  - New subsection "Why this is the right answer for OpenClaw" names
    the debugging-the-black-box pain directly. Skillify makes the tree
    legible: when something breaks, you know which layer (contract,
    test, eval, trigger, or route) to inspect. When anything goes
    stale, check-resolvable flags it.
  - Close with "compounding quality instead of compounding entropy" +
    "not a nice-to-have. It's the piece that makes the skills tree
    survive six months."
  - Expand the code block to include `gbrain check-resolvable` (the
    other half of the pair) so readers see the whole workflow.

Length goes from 17 to 34 lines — still shorter than Minions, still
one section. Worth the space because this is a category shift for
how agent skills get built, not a feature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: root <root@localhost>
@ab0991-oss
Owner Author

Closing as superseded under GIT-1046. This pre-rebase staff branch conflicts with the v0.13 collector baseline; replacement work is carried forward on atlas PR #11 (no edits to staff/* branch history).

@ab0991-oss ab0991-oss closed this Apr 19, 2026